ENTROPY, STOCHASTIC MATRICES, AND QUANTUM OPERATIONS 



LIN ZHANG 

Abstract. The goal of the present paper is to derive conditions for the saturation of the (strong)
subadditivity inequality for stochastic matrices. The notion of relative entropy of stochastic
matrices is introduced by mimicking the quantum relative entropy. Some properties of this
concept are listed, and the connection between the entropy of stochastic quantum operations
and that of stochastic matrices is discussed.

1. Introduction 

If the column vectors $p = [p_1, \ldots, p_N]^T \in \mathbb{R}^N_+$ and $q = [q_1, \ldots, q_N]^T \in \mathbb{R}^N_+$
are two probability distributions, the Shannon entropy of $p$ is defined by
$H(p) = -\sum_{i=1}^N p_i \log_2 p_i$; the relative entropy of $p$ and $q$ is defined by
$H(p\|q) = \sum_{i=1}^N p_i (\log_2 p_i - \log_2 q_i)$ when $p$ is absolutely continuous with respect
to $q$, where $x \log_2 x$ is set to $0$ if $x = 0$; $H(p\|q) = +\infty$ otherwise.
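
As a small numerical companion to these two definitions, the following Python sketch evaluates the Shannon entropy and the relative entropy with the stated conventions. The function names and the sample vectors are illustrative choices, not taken from the paper.

```python
import math

def shannon_entropy(p):
    """H(p) = -sum_i p_i log2 p_i, with 0 log2 0 taken to be 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def relative_entropy(p, q):
    """H(p||q) = sum_i p_i (log2 p_i - log2 q_i); +inf if p is not
    absolutely continuous with respect to q."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * (math.log2(pi) - math.log2(qi))
    return total

# Illustrative distributions (assumed for this example only).
p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]
print(shannon_entropy(p))      # 1.0
print(relative_entropy(p, q))  # 1.0
print(relative_entropy(q, p))  # inf
```

The last call returns infinity because q puts weight on an outcome that p excludes.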

Let $B = [b_{ij}]$ be an $N \times N$ bi-stochastic matrix, that is, $b_{ij} \geq 0$ and
$\sum_{i=1}^N b_{ij} = \sum_{j=1}^N b_{ij} = 1$ for each $i, j = 1, \ldots, N$. Let $\pi$ be a permutation
of the set $\{1, \ldots, N\}$. For any $i, j \in \{1, \ldots, N\}$, we define $c_{ij} = 1$ when $i = \pi(j)$
and $c_{ij} = 0$ when $i \neq \pi(j)$. Then the matrix $C = [c_{ij}]$ is called a permutation matrix.
Let $\mathcal{S}_N$ be the set of all $N \times N$ permutation matrices and $\mathcal{B}_N$ be the convex hull
of $\mathcal{S}_N$. The well-known Birkhoff-von Neumann theorem states that $\mathcal{B}_N$ is the set of
all $N \times N$ bi-stochastic matrices.
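
The easy direction of the Birkhoff-von Neumann theorem (every convex combination of permutation matrices is bi-stochastic) can be checked numerically. The sketch below is a minimal illustration; the helper names, the three permutations and the weights are assumptions made for the example.

```python
def permutation_matrix(pi):
    """C = [c_ij] with c_ij = 1 if i == pi[j] and 0 otherwise (0-based)."""
    n = len(pi)
    return [[1.0 if i == pi[j] else 0.0 for j in range(n)] for i in range(n)]

def convex_combination(weights, matrices):
    """Entry-wise sum_k w_k M_k for square matrices of equal size."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]

def is_bistochastic(b, tol=1e-12):
    """Non-negative entries, all row sums and all column sums equal to 1."""
    n = len(b)
    nonneg = all(b[i][j] >= 0 for i in range(n) for j in range(n))
    rows = all(abs(sum(b[i]) - 1) < tol for i in range(n))
    cols = all(abs(sum(b[i][j] for i in range(n)) - 1) < tol for j in range(n))
    return nonneg and rows and cols

# Illustrative permutations and weights (assumed for this example only).
perms = [permutation_matrix(pi) for pi in ([0, 1, 2], [1, 2, 0], [2, 1, 0])]
B = convex_combination([0.5, 0.3, 0.2], perms)
print(is_bistochastic(B))  # True
```

The converse direction (decomposing a given bi-stochastic matrix into permutations) is the substantive part of the theorem and is not attempted here.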

We only consider finite-dimensional complex Hilbert spaces. A state $\rho$ of a quantum system
described by a Hilbert space $\mathcal{H}$ is a positive semi-definite matrix of trace one, called a
density matrix. The set of all density matrices on $\mathcal{H}$ is denoted by $D(\mathcal{H})$. If
$\rho \in D(\mathcal{H})$ is invertible, then $\rho$ is said to be faithful. If $\rho$ and $\sigma$ are two
quantum states, then the von Neumann entropy of $\rho$ is defined by $S(\rho) = -\mathrm{Tr}(\rho \log_2 \rho)$,
and the quantum relative entropy between $\rho$ and $\sigma$ is defined by
$S(\rho\|\sigma) = \mathrm{Tr}(\rho(\log_2 \rho - \log_2 \sigma))$ if $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$;
$S(\rho\|\sigma) = +\infty$ otherwise, see [7].

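Let $\mathcal{H}$ and $\mathcal{K}$ be two Hilbert spaces, and let $L(\mathcal{H}, \mathcal{K})$ be the set of all linear
operators from $\mathcal{H}$ to $\mathcal{K}$; denote $L(\mathcal{H}, \mathcal{H})$ by $L(\mathcal{H})$. Let $T(\mathcal{H}, \mathcal{K})$ denote the
set of all linear super-operators from $L(\mathcal{H})$ to $L(\mathcal{K})$; similarly, denote $T(\mathcal{H}, \mathcal{H})$ by
$T(\mathcal{H})$. We say that $\Phi \in T(\mathcal{H}, \mathcal{K})$ is completely positive (CP) if for each $k \in \mathbb{N}$,
$\Phi \otimes \mathbb{1}_{M_k(\mathbb{C})} : L(\mathcal{H}) \otimes M_k(\mathbb{C}) \to L(\mathcal{K}) \otimes M_k(\mathbb{C})$
is positive, where $M_k(\mathbb{C})$ is the set of all $k \times k$ complex matrices. It follows from the
well-known theorems of Choi and Kraus that $\Phi$ can be represented in the form $\Phi = \sum_j \mathrm{Ad}_{M_j}$,
where $\{M_j\}_{j=1}^n \subseteq L(\mathcal{H}, \mathcal{K})$, that is, $\Phi(X) = \sum_{j=1}^n M_j X M_j^\dagger$, $X \in L(\mathcal{H})$.
Throughout the present paper, $\dagger$ denotes the adjoint of an operator. Denote by
$\mathrm{CP}(\mathcal{H}, \mathcal{K})$ ($\mathrm{CP}(\mathcal{H})$) the set of all linear CP super-operators in $T(\mathcal{H}, \mathcal{K})$ ($T(\mathcal{H})$).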
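
To make the Kraus form concrete, here is a minimal Python sketch that applies a super-operator given by a list of Kraus matrices to an input matrix. The helper names and the phase-flip example are illustrative assumptions, not part of the paper.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply_channel(kraus, x):
    """Phi(X) = sum_j M_j X M_j^dagger for a list of Kraus matrices M_j."""
    dim = len(kraus[0])
    out = [[0j] * dim for _ in range(dim)]
    for m in kraus:
        term = matmul(matmul(m, x), dagger(m))
        for i in range(dim):
            for j in range(dim):
                out[i][j] += term[i][j]
    return out

# Illustrative choice: qubit phase flip applied with probability 1/2.
s = 0.5 ** 0.5
kraus = [[[s, 0], [0, s]], [[s, 0], [0, -s]]]
rho = [[0.5, 0.5], [0.5, 0.5]]
out = apply_channel(kraus, rho)
trace_out = sum(out[i][i] for i in range(2))
print(abs(trace_out - 1) < 1e-12)  # the trace is preserved
print(abs(out[0][1]) < 1e-12)      # the off-diagonal entry is removed
```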

A quantum operation is a trace non-increasing $\Phi \in \mathrm{CP}(\mathcal{H}, \mathcal{K})$. If $\Phi$ is
trace-preserving, then it is called stochastic; if $\Phi$ is stochastic and unit-preserving, then it is
called bi-stochastic.

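The Jamiołkowski isomorphism $J : T(\mathcal{H}) \to L(\mathcal{H} \otimes \mathcal{H})$ transforms each
$\Phi \in T(\mathcal{H})$ into an operator $J(\Phi) \in L(\mathcal{H} \otimes \mathcal{H})$, where
$J(\Phi) = \Phi \otimes \mathbb{1}_{L(\mathcal{H})}(\mathrm{vec}(\mathbb{1}_{\mathcal{H}}) \mathrm{vec}(\mathbb{1}_{\mathcal{H}})^\dagger)$. If
$\Phi \in \mathrm{CP}(\mathcal{H})$, then $J(\Phi)$ is a positive semi-definite operator; in particular, if $\Phi$ is
stochastic, then $\rho(\Phi) = \frac{1}{N} J(\Phi)$, with $N = \dim \mathcal{H}$, is a state on $\mathcal{H} \otimes \mathcal{H}$.
If $\Phi \in \mathrm{CP}(\mathcal{H})$ is a stochastic quantum operation, we denote the von Neumann entropy
$S(\frac{1}{N} J(\Phi))$ of $\frac{1}{N} J(\Phi)$ by $S^{\mathrm{map}}(\Phi)$ and call it the map entropy [10],
which describes the decoherence induced by the quantum operation $\Phi$.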
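
The Jamiołkowski matrix can be assembled directly from the images of the matrix units, since vec(1)vec(1)^† = Σ_{i,j} |i><j| ⊗ |i><j|. The following Python sketch does this for square Kraus matrices; the function names and the phase-flip example are assumptions made for illustration.

```python
def apply_channel(kraus, x):
    """Phi(X) = sum_k M_k X M_k^dagger for square Kraus matrices M_k."""
    n = len(x)
    out = [[0j] * n for _ in range(n)]
    for m in kraus:
        for i in range(n):
            for j in range(n):
                out[i][j] += sum(m[i][a] * x[a][b] * complex(m[j][b]).conjugate()
                                 for a in range(n) for b in range(n))
    return out

def jamiolkowski(kraus, n):
    """J(Phi) = sum_{i,j} Phi(|i><j|) (x) |i><j| as an n^2 x n^2 matrix."""
    big = [[0j] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            unit = [[1.0 if (r, c) == (i, j) else 0.0 for c in range(n)]
                    for r in range(n)]
            img = apply_channel(kraus, unit)
            for r in range(n):
                for c in range(n):
                    # entry <r i| J |c j> of Phi(|i><j|) (x) |i><j|
                    big[r * n + i][c * n + j] += img[r][c]
    return big

# Illustrative choice: qubit phase flip applied with probability 1/2.
s = 0.5 ** 0.5
kraus = [[[s, 0], [0, s]], [[s, 0], [0, -s]]]
J = jamiolkowski(kraus, 2)
trace_J = sum(J[k][k] for k in range(4)).real
print(abs(trace_J - 2) < 1e-12)  # Tr J(Phi) = N for a stochastic Phi
```

In this example J(Φ) is diagonal with diagonal approximately (1, 0, 0, 1), so the state J(Φ)/2 has eigenvalues 1/2, 1/2 and the map entropy is one bit.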

2. On Saturation of Classical Relative Entropy 

In order to obtain the condition for saturation of classical relative entropy, we need the 
following lemmas. 

Lemma 2.1. Let $\mathcal{H}$ be a Hilbert space, and let $\rho$ and $\sigma$ be two states on $\mathcal{H}$. If
$\Phi \in \mathrm{CP}(\mathcal{H})$ is stochastic, then $S(\Phi(\rho)\|\Phi(\sigma)) \leq S(\rho\|\sigma)$.

Lemma 2.2. ([4]) Let $\{A_1, \ldots, A_k\} \subseteq L(\mathbb{C}^n)$ and $\{B_1, \ldots, B_k\} \subseteq L(\mathbb{C}^m)$ be two
commuting families of Hermitian matrices. Then there exist unitary matrices $U \in L(\mathbb{C}^n)$ and
$V \in L(\mathbb{C}^m)$ such that $U^\dagger A_j U$ and $V B_j V^\dagger$ are diagonal matrices with diagonals
$a_j = [a_{1j}, \ldots, a_{nj}]^T$ and $b_j = [b_{1j}, \ldots, b_{mj}]^T$, respectively, for $j = 1, \ldots, k$.
Moreover, the following conditions are equivalent:

(1) There is a super-operator $\Phi \in \mathrm{CP}(\mathbb{C}^n, \mathbb{C}^m)$ such that $\Phi(A_j) = B_j$ ($j = 1, \ldots, k$).

(2) There is an $m \times n$ non-negative matrix $D = [d_{ij}]$ such that $[b_{ij}] = D[a_{ij}]$.

If statement (2) is satisfied, then $\Phi$ is bi-stochastic if and only if $D$ is bi-stochastic.

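Theorem 2.3. Let $T$ be an $N \times N$ stochastic matrix, and let $p = [p_1, \ldots, p_N]^T$ and
$q = [q_1, \ldots, q_N]^T$ be two $N$-dimensional probability distributions. Then
$H(Tp\|Tq) \leq H(p\|q)$. Moreover, if $p_k, q_k > 0$ for each $1 \leq k \leq N$, then
$H(Tp\|Tq) = H(p\|q)$ if and only if the following conditions hold:

(i) $p = \bigoplus_{k=1}^K \mu_k p^{(k)} \otimes \tau_k$ and $q = \bigoplus_{k=1}^K \nu_k q^{(k)} \otimes \tau_k$, where
$p^{(k)}, q^{(k)}$ denote $m_k$-dimensional probability vectors, $\tau_k$ denotes an $n_k$-dimensional
probability vector, and $\mu_k, \nu_k > 0$, $k = 1, \ldots, K$; $\sum_{k=1}^K \mu_k = \sum_{k=1}^K \nu_k = 1$,
$\sum_{k=1}^K m_k n_k = N$;

(ii) $T = \bigoplus_{k=1}^K \pi_k \otimes T_k$, where $\pi_k \in \mathcal{S}_{m_k}$ and $T_k$ is an $n_k \times n_k$ stochastic
matrix for each $k = 1, \ldots, K$.

Proof. Let $\rho$, $\sigma$, $\rho'$ and $\sigma'$ be diagonal matrices with diagonals $p$, $q$, $Tp$ and $Tq$,
respectively. Then it follows from Lemma 2.2 that there is a stochastic
$\Phi \in \mathrm{CP}(\mathbb{C}^N, \mathbb{C}^N)$ such that $\Phi(\rho) = \rho'$, $\Phi(\sigma) = \sigma'$. Note that
$H(p\|q) = S(\rho\|\sigma)$ and $H(Tp\|Tq) = S(\rho'\|\sigma')$, so by Lemma 2.1 we have
$H(Tp\|Tq) \leq H(p\|q)$. Moreover, if $p_k, q_k > 0$ for each $1 \leq k \leq N$, then the states
$\rho$, $\sigma$, $\Phi(\rho) = \rho'$ and $\Phi(\sigma) = \sigma'$ are faithful. If $H(Tp\|Tq) = H(p\|q)$, then
$S(\rho\|\sigma) = S(\Phi(\rho)\|\Phi(\sigma))$. It is known that, for a stochastic
$\Phi \in \mathrm{CP}(\mathcal{H}, \mathcal{K})$ and faithful states,

$$S(\Phi(\rho)\|\Phi(\sigma)) = S(\rho\|\sigma)$$

if and only if the following statements hold:

(1) $\mathcal{H}$ and $\mathcal{K}$ can be decomposed in the form $\mathcal{H} = \bigoplus_{k=1}^K \mathcal{H}_k^L \otimes \mathcal{H}_k^R$,
$\mathcal{K} = \bigoplus_{k=1}^K \mathcal{K}_k^L \otimes \mathcal{K}_k^R$, where $\dim \mathcal{H}_k^L = \dim \mathcal{K}_k^L$.

(2) If $\Phi_k$ is the restriction of $\Phi$ to $L(\mathcal{H}_k^L \otimes \mathcal{H}_k^R)$, then
$\Phi_k \in T(\mathcal{H}_k^L \otimes \mathcal{H}_k^R, \mathcal{K}_k^L \otimes \mathcal{K}_k^R)$ and it can be factorized into the form
$\Phi_k = \mathrm{Ad}_{U_k} \otimes \Phi_k^R$, where $U_k : \mathcal{H}_k^L \to \mathcal{K}_k^L$ is a unitary operator and
$\Phi_k^R \in T(\mathcal{H}_k^R, \mathcal{K}_k^R)$ is stochastic, $k = 1, \ldots, K$.

(3) The states decompose as $\rho = \bigoplus_{k=1}^K p_k \rho_k^L \otimes \tau_k^R$ and
$\sigma = \bigoplus_{k=1}^K q_k \sigma_k^L \otimes \tau_k^R$, where all the operators are density operators, and
$\{p_k\}_{k=1}^K$ and $\{q_k\}_{k=1}^K$ are probability distributions.

Applying this decomposition to the diagonal states $\rho$, $\sigma$ and to the operation $\Phi$
constructed above yields conditions (i) and (ii). □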
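
The inequality of Theorem 2.3, and its saturation by a permutation matrix (the simplest instance of condition (ii)), can be checked numerically. The sketch below is a sanity check under assumed sample data; the matrices and vectors are illustrative, and all entries are taken strictly positive so that no logarithm of zero occurs.

```python
import math

def relative_entropy(p, q):
    """H(p||q) for strictly positive q (assumption of this sketch)."""
    return sum(a * (math.log2(a) - math.log2(b)) for a, b in zip(p, q) if a > 0)

def apply_matrix(t, p):
    """Matrix-vector product T p for T given as a list of rows."""
    return [sum(t[i][j] * p[j] for j in range(len(p))) for i in range(len(t))]

# Column-stochastic T (each column sums to 1) and a permutation matrix P.
T = [[0.7, 0.2, 0.1],
     [0.2, 0.6, 0.3],
     [0.1, 0.2, 0.6]]
P = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

before = relative_entropy(p, q)
after_T = relative_entropy(apply_matrix(T, p), apply_matrix(T, q))
after_P = relative_entropy(apply_matrix(P, p), apply_matrix(P, q))
print(after_T <= before)                # monotonicity
print(abs(after_P - before) < 1e-12)    # a permutation saturates the bound
```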

Remark 2.4. It was shown in earlier work that $S(\Phi(\rho)) = S(\rho)$ if and only if
$\Phi^\dagger \circ \Phi(\rho) = \rho$, and an explicit construction of the state $\rho$ and of the quantum
operation $\Phi$ was given there. We can employ this result to give an explicit construction
for $T$ and $p$ in the identity $H(Tp) = H(p)$. The proof is straightforward and is omitted.

3. Relative Entropy of Stochastic Matrices 

In this section, the entropy of stochastic matrices is discussed; more details can be found
in [12]. We go deeper into the entropy of stochastic matrices and derive some conditions on
the (strong) additivity for stochastic matrices. The notion of relative entropy of stochastic
matrices is introduced by mimicking the quantum relative entropy. Some properties of this
concept are listed, and the connection between the entropy of stochastic quantum operations
and that of stochastic matrices is discussed.

To be specific, for any $N \times N$ stochastic matrix $T = [t_{\mu\nu}]$, the weighted entropy [12] of
$T$ by a probability vector $p = [p_1, \ldots, p_N]^T$ is defined by $H_p(T) = \sum_{\nu=1}^N p_\nu H(t_\nu)$, where
$T = [t_1, \ldots, t_N]$ and $t_\nu = [t_{1\nu}, \ldots, t_{N\nu}]^T$ is the $\nu$th column vector of $T$. In particular,
$H(T) = \frac{1}{N} \sum_{\nu=1}^N H(t_\nu)$ is defined for $p = \frac{1}{N}[1, \ldots, 1]^T$.
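
The weighted entropy is just a weighted average of column entropies, which the following Python sketch makes explicit. The function names and the 2 x 2 example are illustrative assumptions.

```python
import math

def shannon_entropy(p):
    """H(p) = -sum_i p_i log2 p_i, with 0 log2 0 taken to be 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def column(t, v):
    """The v-th column of a matrix given as a list of rows."""
    return [row[v] for row in t]

def weighted_entropy(t, p=None):
    """H_p(T) = sum_v p_v H(t_v); the uniform p gives H(T)."""
    n = len(t)
    if p is None:
        p = [1.0 / n] * n
    return sum(p[v] * shannon_entropy(column(t, v)) for v in range(n))

# Illustrative stochastic matrix: the first column is uniform, the second is deterministic.
T = [[0.5, 1.0],
     [0.5, 0.0]]
print(weighted_entropy(T))                # 0.5
print(weighted_entropy(T, [0.25, 0.75]))  # 0.25
```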

For any two $N \times N$ stochastic matrices $A$ and $B$, the relative entropy between $A$ and $B$ with
respect to a probability vector $p = [p_1, \ldots, p_N]^T$ is defined by
$H_p(A\|B) = \sum_{\nu=1}^N p_\nu H(a_\nu\|b_\nu)$, where $a_\nu$ and $b_\nu$ are the $\nu$th column vectors of $A$ and
$B$, respectively. Similarly, $H(A\|B) = \frac{1}{N} \sum_{\nu=1}^N H(a_\nu\|b_\nu)$ is defined for
$p = \frac{1}{N}[1, \ldots, 1]^T$.

The following conclusions are immediate: $H_p(\cdot)$ is a nonnegative and concave function, and
$H_p(\cdot\|\cdot)$ is a jointly convex function.

In what follows, the monotonicity of relative entropy of stochastic matrices is obtained. 

Theorem 3.1. If $T$, $A$, $B$ are all $N \times N$ stochastic matrices, then

$$H_p(TA\|TB) \leq H_p(A\|B),$$

where $p$ is an $N$-dimensional probability vector. Moreover, if all the components of $p$ are
positive, then

$$H_p(TA\|TB) = H_p(A\|B)$$

if and only if the following conditions hold:

(i) $a_j = \bigoplus_{k=1}^K \mu_j^{(k)} p_j^{(k)} \otimes \tau_k$ and $b_j = \bigoplus_{k=1}^K \nu_j^{(k)} q_j^{(k)} \otimes \tau_k$, where
$p_j^{(k)}, q_j^{(k)}$ denote $m_k$-dimensional probability vectors, $\tau_k$ are $n_k$-dimensional probability
vectors, and $\mu_j^{(k)}, \nu_j^{(k)} > 0$, $\sum_{k=1}^K \mu_j^{(k)} = \sum_{k=1}^K \nu_j^{(k)} = 1$ for each
$j = 1, \ldots, N$; $\sum_{k=1}^K m_k n_k = N$;

(ii) $T = \bigoplus_{k=1}^K \pi_k \otimes T_k$, where $\pi_k \in \mathcal{S}_{m_k}$ and $T_k$ is an $n_k \times n_k$ stochastic
matrix for each $k$.

Proof. By the definition of relative entropy for stochastic matrices, it follows that

$$H_p(A\|B) = \sum_{j=1}^N p_j H(a_j\|b_j),$$

where $a_j = [a_{1j}, \ldots, a_{Nj}]^T$ and $b_j = [b_{1j}, \ldots, b_{Nj}]^T$ are the $j$th columns of $A$ and
$B$, respectively. Now, by Theorem 2.3 applied to each column,

$$H_p(TA\|TB) = \sum_{j=1}^N p_j H((TA)_j\|(TB)_j) = \sum_{j=1}^N p_j H(Ta_j\|Tb_j)
\leq \sum_{j=1}^N p_j H(a_j\|b_j) = H_p(A\|B).$$

Since the inequality holds term by term, when the components of $p$ are all positive,
$H_p(TA\|TB) = H_p(A\|B)$ if and only if $H(Ta_j\|Tb_j) = H(a_j\|b_j)$ for each $j$. By
Theorem 2.3 the equality condition follows immediately. □
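
The column-by-column argument in the proof translates directly into code. The Python sketch below evaluates the relative entropy of stochastic matrices and checks the monotonicity of Theorem 3.1 on sample data; the matrices, the weight vector and the helper names are assumptions made for illustration, with all entries strictly positive.

```python
import math

def rel_ent(p, q):
    """H(p||q) for strictly positive q (assumption of this sketch)."""
    return sum(a * (math.log2(a) - math.log2(b)) for a, b in zip(p, q) if a > 0)

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def column(m, j):
    """The j-th column of a matrix given as a list of rows."""
    return [row[j] for row in m]

def matrix_relative_entropy(a, b, p):
    """H_p(A||B) = sum_j p_j H(a_j||b_j) over the columns of A and B."""
    return sum(p[j] * rel_ent(column(a, j), column(b, j)) for j in range(len(p)))

# Illustrative column-stochastic matrices and a positive weight vector.
A = [[0.9, 0.2], [0.1, 0.8]]
B = [[0.6, 0.5], [0.4, 0.5]]
T = [[0.7, 0.4], [0.3, 0.6]]
p = [0.3, 0.7]

lhs = matrix_relative_entropy(matmul(T, A), matmul(T, B), p)
rhs = matrix_relative_entropy(A, B, p)
print(lhs <= rhs)  # True
```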

Remark 3.2. Now denote $L^{(k)} = [p_1^{(k)}, \ldots, p_N^{(k)}]$ and $R^{(k)} = [q_1^{(k)}, \ldots, q_N^{(k)}]$. Let
$E^{(k)} = \mathrm{Diag}[\mu_1^{(k)}, \ldots, \mu_N^{(k)}]$ and $F^{(k)} = \mathrm{Diag}[\nu_1^{(k)}, \ldots, \nu_N^{(k)}]$. The explicit
forms of $A$, $B$ can be written as

$$A = \bigoplus_{k=1}^K L^{(k)} E^{(k)} \otimes \tau_k \quad \text{and} \quad B = \bigoplus_{k=1}^K R^{(k)} F^{(k)} \otimes \tau_k,$$

so that the $j$th columns of $A$ and $B$ are the vectors $a_j$ and $b_j$ of Theorem 3.1(i). Here
$L^{(k)}$ and $R^{(k)}$ are any stochastic matrices, $k = 1, \ldots, K$. Furthermore,
$\sum_{k=1}^K E^{(k)} = \sum_{k=1}^K F^{(k)} = \mathbb{1}_N$.

For a finite collection $\{B^{(i)}\}$ of $N \times N$ stochastic matrices, denote $B = \sum_i \lambda_i B^{(i)}$,
where $\{\lambda_i\}$ is a probability vector. The $\chi$-quantity for $\{B^{(i)}\}$ is defined by
$\chi_p(\{\lambda_i, B^{(i)}\}) = \sum_i \lambda_i H_p(B^{(i)}\|B)$, where $p$ is a probability vector. It is easily
seen from Theorem 3.1 that

(i) $\chi_p(\{\lambda_i, B^{(i)}\}) = H_p(\sum_i \lambda_i B^{(i)}) - \sum_i \lambda_i H_p(B^{(i)})$;

(ii) $\sum_i \lambda_i H_p(B^{(i)}\|D) = \chi_p(\{\lambda_i, B^{(i)}\}) + H_p(B\|D)$, i.e.,
$\sum_i \lambda_i H_p(B^{(i)}\|D) = \sum_i \lambda_i H_p(B^{(i)}\|B) + H_p(B\|D)$, where $D$ is an $N \times N$
stochastic matrix and $p$ is an $N$-dimensional probability vector;

(iii) Assume that $T$ is an $N \times N$ stochastic matrix. Then
$\chi_p(\{\lambda_i, TB^{(i)}\}) \leq \chi_p(\{\lambda_i, B^{(i)}\})$ if and only if $H_p(TB) - H_p(B)$ is a convex
function in its argument stochastic matrix $B$; moreover,
$\chi_p(\{\lambda_i, TB^{(i)}\}) \leq \chi_p(\{\lambda_i, B^{(i)}\})$.

In [12], W. Słomczyński showed the following. Let $X$, $Y$, $Z$ be $N \times N$ stochastic matrices
for which $p$ is a common invariant probability vector, i.e. $Xp = Yp = Zp = p$. Then:

(i) $H_p(Y) \leq H_p(XY) \leq H_p(X) + H_p(Y)$;

(ii) $H_p(XYZ) + H_p(Y) \leq H_p(XY) + H_p(YZ)$.

The following result deals with the saturation of the above two inequalities.

Proposition 3.3. (i) If $T \in \mathcal{B}_N$, $A$ is an $N \times N$ stochastic matrix and $p$ is an
$N$-dimensional probability vector with all positive components, then $H_p(TA) = H_p(A)$ if and
only if $T^T T A = A$;

(ii) If $X = X_L \otimes \pi_R$ and $Y = \pi_L \otimes Y_R$ for $X_L$ being a stochastic matrix of size
$m \times m$, $\pi_L \in \mathcal{S}_m$, $Y_R$ being a stochastic matrix of size $n \times n$, $\pi_R \in \mathcal{S}_n$, then
$H(XY) = H(X) + H(Y)$;

(iii) If $X = \bigoplus_{k=1}^K X_k^L \otimes \pi_k^R$, $Y = \bigoplus_{k=1}^K Y_k^L \otimes Y_k^R$ and
$Z = \bigoplus_{k=1}^K \pi_k^L \otimes Z_k^R$, where $X_k^L$ and $Y_k^L$ are stochastic matrices of size
$m_k \times m_k$, $\pi_k^L \in \mathcal{S}_{m_k}$, $Y_k^R$ and $Z_k^R$ are stochastic matrices of size $n_k \times n_k$, and
$\pi_k^R \in \mathcal{S}_{n_k}$, then $H(XYZ) + H(Y) = H(XY) + H(YZ)$.

Proof. (i) Since $T$ is bi-stochastic, $H(Ta_j) \geq H(a_j)$ for every $j$, where $a_j$ is the $j$th
column vector of $A$. As each component $p_j$ is positive, it follows that $H_p(TA) = H_p(A)$ if
and only if $H(Ta_j) = H(a_j)$ for every $j$. By the classical counterpart of the result quoted in
Remark 2.4, i.e. for $B \in \mathcal{B}_N$, $H(Bp) = H(p)$ if and only if $B^T B p = p$, we get that
$H(Ta_j) = H(a_j)$ for every $j$ if and only if $T^T T a_j = a_j$ for all $j$; this completes the
proof.

(ii) Since $XY = X_L \pi_L \otimes \pi_R Y_R$, and since multiplying by a permutation matrix only
permutes columns or rows, it follows that
$H(XY) = H(X_L \pi_L \otimes \pi_R Y_R) = H(X_L) + H(Y_R) = H(X) + H(Y)$, which is the conclusion.

(iii) Since

$$XYZ = \bigoplus_{k=1}^K X_k^L Y_k^L \pi_k^L \otimes \pi_k^R Y_k^R Z_k^R,$$

it follows that

$$H(XYZ) = \sum_k \lambda_k [H(X_k^L Y_k^L) + H(Y_k^R Z_k^R)],$$
$$H(XY) = \sum_k \lambda_k [H(X_k^L Y_k^L) + H(Y_k^R)],$$
$$H(YZ) = \sum_k \lambda_k [H(Y_k^L) + H(Y_k^R Z_k^R)],$$
$$H(Y) = \sum_k \lambda_k [H(Y_k^L) + H(Y_k^R)],$$

where $\lambda_k = m_k n_k / N$ and $\sum_k m_k n_k = N$. Combining all these expressions gives the
desired result. □
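
The additivity in Proposition 3.3(ii) can be verified numerically on a small instance. The Python sketch below builds X and Y from 2 x 2 factors via the Kronecker product; the factor matrices and helper names are illustrative assumptions.

```python
import math

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def entropy(t):
    """H(T): average Shannon entropy (in bits) of the columns of T."""
    n = len(t)
    return -sum(x * math.log2(x) for row in t for x in row if x > 0) / n

# Illustrative factors: stochastic X_L, Y_R and the 2 x 2 swap permutation.
X_L = [[0.7, 0.4], [0.3, 0.6]]
Y_R = [[0.9, 0.5], [0.1, 0.5]]
swap = [[0.0, 1.0], [1.0, 0.0]]

X = kron(X_L, swap)   # X = X_L (x) pi_R
Y = kron(swap, Y_R)   # Y = pi_L (x) Y_R
XY = matmul(X, Y)
print(abs(entropy(XY) - (entropy(X) + entropy(Y))) < 1e-12)  # True
```

The check relies on two facts used in the proof: permutation matrices have zero entropy, and the entropy of a Kronecker product is the sum of the entropies of its factors.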



4. The Relationship Between Quantum Operations and Bi-stochastic Matrices 

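Let $\Phi$ and $\Psi$ be CP super-operators with Kraus representations $\Phi = \sum_i \mathrm{Ad}_{M_i}$ and
$\Psi = \sum_j \mathrm{Ad}_{N_j}$, respectively. It is easily seen that $\Phi \otimes \Psi = \sum_{i,j} \mathrm{Ad}_{M_i \otimes N_j}$ and
$J(\Phi \otimes \Psi) = J(\Phi) \otimes J(\Psi)$; denote $\Psi^T = \sum_j \mathrm{Ad}_{N_j^T}$. Then
$J(\Phi \circ \Psi) = \Phi \otimes \mathbb{1}(J(\Psi)) = \mathbb{1} \otimes \Psi^T(J(\Phi)) = \Phi \otimes \Psi^T(\mathrm{vec}(\mathbb{1}) \mathrm{vec}(\mathbb{1})^\dagger)$.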

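Let $\Phi, \Psi \in \mathrm{CP}(\mathcal{H})$ be stochastic. The relative entropy between $\Phi$ and $\Psi$ is defined by
$S(\Phi\|\Psi) = S(\rho(\Phi)\|\rho(\Psi))$, where $\rho(\Phi) = \frac{1}{N} J(\Phi)$ denotes the Jamiołkowski state of
$\Phi$. If $\Lambda \in \mathrm{CP}(\mathcal{H})$ is bi-stochastic, then $S(\Lambda \circ \Phi\|\Lambda \circ \Psi) \leq S(\Phi\|\Psi)$.
This can be seen easily from Lemma 2.1. Indeed,
$S(\Lambda \otimes \mathbb{1}_{L(\mathcal{H})}(\rho(\Phi))\|\Lambda \otimes \mathbb{1}_{L(\mathcal{H})}(\rho(\Psi))) \leq S(\rho(\Phi)\|\rho(\Psi)) = S(\Phi\|\Psi)$,
since $\Lambda \otimes \mathbb{1}_{L(\mathcal{H})}$ is bi-stochastic whenever $\Lambda$ is bi-stochastic.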

Assume that $\Phi$ is a CP stochastic super-operator whose Kraus decomposition can be written
as $\Phi = \sum_\mu \mathrm{Ad}_{T_\mu}$. Define the Kraus matrix of $\Phi$ as $B(\Phi) := \sum_\mu T_\mu \bullet T_\mu^*$, where
$\bullet$ denotes the Schur (entry-wise) product of matrices and $*$ denotes the entry-wise complex
conjugate of a matrix. Hence the $(i, j)$th entry $b_{ij}$ of $B$ can be described by
$b_{ij} = \sum_\mu t_{ij}^{(\mu)} \bar{t}_{ij}^{(\mu)}$, where $T_\mu = [t_{ij}^{(\mu)}]$ and the bar denotes the complex
conjugate of a complex number.
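
Since the Schur product of a matrix with its entry-wise conjugate is the matrix of squared moduli, the Kraus matrix is easy to compute. The following Python sketch does so; the function name and the amplitude-damping example (with damping parameter 0.25) are illustrative assumptions.

```python
def kraus_matrix(kraus):
    """B(Phi) = sum_mu T_mu . conj(T_mu): summed entry-wise squared moduli."""
    n_rows, n_cols = len(kraus[0]), len(kraus[0][0])
    return [[sum(abs(t[i][j]) ** 2 for t in kraus) for j in range(n_cols)]
            for i in range(n_rows)]

# Illustrative channel: qubit amplitude damping with parameter gamma.
gamma = 0.25
K0 = [[1.0, 0.0], [0.0, (1 - gamma) ** 0.5]]
K1 = [[0.0, gamma ** 0.5], [0.0, 0.0]]
B = kraus_matrix([K0, K1])

# Each column of B sums to 1, so B is stochastic (but not bi-stochastic here).
col_sums = [sum(B[i][j] for i in range(2)) for j in range(2)]
print(all(abs(c - 1) < 1e-12 for c in col_sums))  # True
```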

For any two Hermitian matrices $X$ and $Y$, $X$ is majorized by $Y$, denoted by $X \prec Y$, if there
is a CP bi-stochastic super-operator $\Phi$ such that $X = \Phi(Y)$. The well-known Schur theorem
states that $\mathrm{Diag}(X) \prec X$ for any Hermitian matrix $X$. Likewise, for any bi-stochastic quantum
operation $\Lambda$, it follows that $\Lambda(\rho) \prec \rho$. Moreover, $J(\Phi \circ \Psi) \prec J(\Phi)$ and
$J(\Phi \circ \Psi) \prec J(\Psi)$ for any two bi-stochastic quantum operations $\Phi$ and $\Psi$.

In what follows, some properties of the Kraus matrix are listed.

Proposition 4.1. (i) For a given (bi-)stochastic super-operator $\Phi \in \mathrm{CP}(\mathcal{H})$, $B(\Phi)$ is a
(bi-)stochastic matrix.

(ii) $B(\Phi)$ is well-defined, i.e., it is independent of the choice of Kraus decomposition for $\Phi$
and depends only on $\Phi$ itself.

(iii) $B(\Phi)$ is an affine (in particular convex) function of its argument $\Phi$, i.e.,
$B(t\Phi_1 + (1 - t)\Phi_2) = tB(\Phi_1) + (1 - t)B(\Phi_2)$ for any $\Phi_1$, $\Phi_2$ and all $t \in [0, 1]$.

(iv) Denote $\mathcal{M}(B) = \{\Phi \mid \Phi \in T(\mathcal{H})$ is a CP stochastic super-operator and $B(\Phi) = B\}$. Then
$\mathcal{M}(B)$ is a nonempty convex set.

(v) $B(\Theta_1 \otimes \Theta_2) = B(\Theta_1) \otimes B(\Theta_2)$ for any stochastic quantum operations $\Theta_1$ and $\Theta_2$.

(vi) Assume that $\mathcal{H}$ and $\mathcal{K}$ are $M$- and $N$-dimensional Hilbert spaces, respectively. If
$\Lambda \in T(\mathcal{H} \otimes \mathcal{K})$ is CP and stochastic and can be described by $\Lambda = \sum_k \lambda_k \Phi_k \otimes \Psi_k$, where
$\{\Phi_k\} \subseteq T(\mathcal{H})$ and $\{\Psi_k\} \subseteq T(\mathcal{K})$ are two collections of CP stochastic super-operators,
then $B(\Lambda) = \sum_k \lambda_k B(\Phi_k) \otimes B(\Psi_k)$, where $\lambda = \{\lambda_k\}_k$ is a finite probability vector.

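Proof. (i) The proof is trivial.

(ii) Assume that $\Phi = \sum_{\mu=1}^{N^2} \mathrm{Ad}_{S_\mu} = \sum_{\nu=1}^{N^2} \mathrm{Ad}_{T_\nu}$. By the unitary freedom of
quantum operations, there is an $N^2 \times N^2$ unitary matrix $U = [u_{\mu\nu}]$ such that
$S_\mu = \sum_{\nu=1}^{N^2} u_{\mu\nu} T_\nu$. Then

$$\sum_{\mu=1}^{N^2} S_\mu \bullet S_\mu^*
= \sum_{\mu=1}^{N^2} \Big(\sum_{\nu=1}^{N^2} u_{\mu\nu} T_\nu\Big) \bullet \Big(\sum_{\kappa=1}^{N^2} \bar{u}_{\mu\kappa} T_\kappa^*\Big)
= \sum_{\nu,\kappa=1}^{N^2} \Big(\sum_{\mu=1}^{N^2} u_{\mu\nu} \bar{u}_{\mu\kappa}\Big) T_\nu \bullet T_\kappa^*$$
$$= \sum_{\nu,\kappa=1}^{N^2} \delta_{\nu\kappa} T_\nu \bullet T_\kappa^* = \sum_{\nu=1}^{N^2} T_\nu \bullet T_\nu^*,$$

which implies that $B(\Phi)$ is well-defined.

(iii) Choose any two stochastic quantum operations $\Phi_1$ and $\Phi_2$ with corresponding
Kraus decompositions $\Phi_1 = \sum_\mu \mathrm{Ad}_{S_\mu}$ and $\Phi_2 = \sum_\nu \mathrm{Ad}_{T_\nu}$. Let $t \in [0, 1]$. Then a Kraus
decomposition for $t\Phi_1 + (1 - t)\Phi_2$ is
$t\Phi_1 + (1 - t)\Phi_2 = \sum_\mu \mathrm{Ad}_{\sqrt{t} S_\mu} + \sum_\nu \mathrm{Ad}_{\sqrt{1-t} T_\nu}$, which
implies that the Kraus matrix for $t\Phi_1 + (1 - t)\Phi_2$ is

$$B(t\Phi_1 + (1 - t)\Phi_2) = \sum_\mu (\sqrt{t} S_\mu) \bullet (\sqrt{t} S_\mu^*) + \sum_\nu (\sqrt{1-t} T_\nu) \bullet (\sqrt{1-t} T_\nu^*)$$
$$= t \sum_\mu S_\mu \bullet S_\mu^* + (1 - t) \sum_\nu T_\nu \bullet T_\nu^*$$
$$= tB(\Phi_1) + (1 - t)B(\Phi_2).$$

(iv) If $\Psi_1, \Psi_2 \in \mathcal{M}(B)$, it follows from (iii) that
$B(t\Psi_1 + (1 - t)\Psi_2) = tB(\Psi_1) + (1 - t)B(\Psi_2) = B$ since $B(\Psi_1) = B(\Psi_2) = B$, which implies
$t\Psi_1 + (1 - t)\Psi_2 \in \mathcal{M}(B)$. The fact that $\mathcal{M}(B)$ is not empty is clear.

(v) Let the Kraus decompositions for $\Theta_1$ and $\Theta_2$ be $\Theta_1 = \sum_\mu \mathrm{Ad}_{S_\mu}$ and
$\Theta_2 = \sum_\nu \mathrm{Ad}_{T_\nu}$. Then $\Theta_1 \otimes \Theta_2 = \sum_{\mu,\nu} \mathrm{Ad}_{S_\mu \otimes T_\nu}$. Now

$$B(\Theta_1 \otimes \Theta_2) = \sum_{\mu,\nu} (S_\mu \otimes T_\nu) \bullet (S_\mu^* \otimes T_\nu^*)
= \Big(\sum_\mu S_\mu \bullet S_\mu^*\Big) \otimes \Big(\sum_\nu T_\nu \bullet T_\nu^*\Big)
= B(\Theta_1) \otimes B(\Theta_2).$$

(vi) It follows by combining the above conclusions (iii) and (v).

□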
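
Part (ii) of the proposition can be illustrated with two different Kraus decompositions of the same operation. In the Python sketch below, both lists represent the map sending a matrix to its diagonal part; the helper names and the chosen decompositions are illustrative assumptions.

```python
def kraus_matrix(kraus):
    """B(Phi): summed entry-wise squared moduli of the Kraus matrices."""
    n = len(kraus[0])
    return [[sum(abs(t[i][j]) ** 2 for t in kraus) for j in range(n)]
            for i in range(n)]

def apply_channel(kraus, x):
    """Phi(X) = sum_k M_k X M_k^dagger for square Kraus matrices M_k."""
    n = len(x)
    out = [[0j] * n for _ in range(n)]
    for m in kraus:
        for i in range(n):
            for j in range(n):
                out[i][j] += sum(m[i][a] * x[a][b] * complex(m[j][b]).conjugate()
                                 for a in range(n) for b in range(n))
    return out

# Two Kraus decompositions of the same qubit operation X -> Diag(X).
s = 0.5 ** 0.5
decomp_a = [[[s, 0], [0, s]], [[s, 0], [0, -s]]]   # {I, Z} / sqrt(2)
decomp_b = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]    # projectors |0><0|, |1><1|

X = [[1, 2], [3, 4]]
out_a, out_b = apply_channel(decomp_a, X), apply_channel(decomp_b, X)
same_map = all(abs(out_a[i][j] - out_b[i][j]) < 1e-12
               for i in range(2) for j in range(2))
Ba, Bb = kraus_matrix(decomp_a), kraus_matrix(decomp_b)
same_B = all(abs(Ba[i][j] - Bb[i][j]) < 1e-12 for i in range(2) for j in range(2))
print(same_map, same_B)  # True True
```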

Remark 4.2. By Proposition 4.1(iv), $\mathcal{M}(B)$ is a nonempty convex set. In fact, it is also
compact. Thus the question naturally arises: what are the extreme points of $\mathcal{M}(B)$? Note that
in [8], Parthasarathy gave a characterization of extremal quantum states of composite systems
with fixed marginal states. Subsequently, Rudolph gave another characterization in [11].
Therefore, our question can be described in the language of [8], [11] under the additional
condition that the diagonal of the Jamiołkowski state is fixed.

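Remark 4.3. Assume that $\mathcal{H}$ is an $N$-dimensional Hilbert space and $\Phi, \Psi \in \mathrm{CP}(\mathcal{H})$ are
stochastic with

$$J(\Phi) = \sum_{m,\mu=1}^N p_{m\mu} |m\mu\rangle\langle m\mu| \quad \text{and} \quad J(\Psi) = \sum_{m,\mu=1}^N q_{m\mu} |m\mu\rangle\langle m\mu|.$$

By stochasticity, $\sum_{m=1}^N p_{m\mu} = 1$ and $\sum_{m=1}^N q_{m\mu} = 1$ for each $\mu$. Then
$J(\Phi \circ \Psi) = \sum_{m,\mu=1}^N [B(\Phi)B(\Psi)]_{m\mu} |m\mu\rangle\langle m\mu|$, where $B(\Phi) = [p_{m\mu}]$ and
$B(\Psi) = [q_{m\mu}]$, which implies that

$$B(\Phi \circ \Psi) = B(\Phi)B(\Psi), \quad B(\Psi \circ \Phi) = B(\Psi)B(\Phi)$$

and

$$S^{\mathrm{map}}(\Phi \circ \Psi) = H(B(\Phi \circ \Psi)) + \log N, \quad S^{\mathrm{map}}(\Psi \circ \Phi) = H(B(\Psi \circ \Phi)) + \log N.$$

Now $S^{\mathrm{map}}(\Phi) + S^{\mathrm{map}}(\Psi) - S^{\mathrm{map}}(\Phi \circ \Psi) = H(B(\Phi)) + H(B(\Psi)) - H(B(\Phi)B(\Psi)) + \log N$.

Generally speaking, $B(\Phi \circ \Psi) \neq B(\Phi)B(\Psi)$ for two stochastic super-operators
$\Phi, \Psi \in \mathrm{CP}(\mathcal{H})$. The computation above shows that if both $J(\Phi)$ and $J(\Psi)$ are diagonal,
then $B(\Phi \circ \Psi) = B(\Phi)B(\Psi)$. This raises the following question: what is a necessary and
sufficient condition for $B(\Phi \circ \Psi) = B(\Phi)B(\Psi)$ for stochastic super-operators
$\Phi, \Psi \in \mathrm{CP}(\mathcal{H})$? It is conjectured that $H_p(B(\Phi \circ \Psi)) \leq H_p(B(\Phi)B(\Psi))$ for any
stochastic super-operators $\Phi, \Psi \in \mathrm{CP}(\mathcal{H})$, where $p$ is any $N$-dimensional probability vector.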
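
A simple instance in which the Kraus matrix of a composition differs from the product of the Kraus matrices is the unitary operation given by the Hadamard matrix, composed with itself. The Python sketch below works this out; the choice of example and the helper names are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kraus_matrix(kraus):
    """B(Phi): summed entry-wise squared moduli of the Kraus matrices."""
    n = len(kraus[0])
    return [[sum(abs(t[i][j]) ** 2 for t in kraus) for j in range(n)]
            for i in range(n)]

def compose(kraus_phi, kraus_psi):
    """Kraus matrices of Phi o Psi: all products M_i N_j."""
    return [matmul(m, n) for m in kraus_phi for n in kraus_psi]

# Illustrative unitary operation Phi = Ad_U with U the Hadamard matrix.
h = 0.5 ** 0.5
U = [[h, h], [h, -h]]
phi = [U]

B_phi = kraus_matrix(phi)                     # every entry is close to 1/2
B_comp = kraus_matrix(compose(phi, phi))      # U*U = identity, so B is close to I
B_prod = matmul(B_phi, B_phi)                 # every entry is close to 1/2
print(abs(B_comp[0][0] - 1) < 1e-12, abs(B_prod[0][0] - 0.5) < 1e-12)  # True True
```

Here B(Φ∘Φ) is the identity with entropy 0, while B(Φ)B(Φ) has entropy 1 bit, which is consistent with the conjectured inequality in this one instance.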

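Proposition 4.4. Assume that $\mathcal{H}$ is an $N$-dimensional Hilbert space.

(i) If $\Phi \in \mathrm{CP}(\mathcal{H})$ is stochastic, then $S^{\mathrm{map}}(\Phi) \leq H(B(\Phi)) + \log N$;

(ii) If $\Phi, \Psi \in \mathrm{CP}(\mathcal{H})$ are stochastic, then $H(B(\Phi)\|B(\Psi)) \leq S(\Phi\|\Psi)$.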

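Proof. (i) By Schur's theorem, it follows that $\mathrm{Diag}(J(\Phi)) \prec J(\Phi)$, which is equivalent to
$\mathrm{Diag}(\rho(\Phi)) \prec \rho(\Phi)$. Since $\langle m|B(\Phi)|\mu\rangle = \langle m\mu|J(\Phi)|m\mu\rangle$, it can be seen that

$$S^{\mathrm{map}}(\Phi) = S(\rho(\Phi)) \leq S(\mathrm{Diag}(\rho(\Phi))) = H(B(\Phi)) + \log N.$$

Furthermore, $S^{\mathrm{map}}(\Phi) = H(B(\Phi)) + \log N$ when $\Phi$ is represented by a diagonal dynamical
matrix $J(\Phi)$.

(ii) The map $\Lambda : \rho \mapsto \mathrm{Diag}(\rho)$ is a CP bi-stochastic super-operator, in accordance with
$\mathrm{Diag}(\rho) \prec \rho$ from Schur's theorem. Thus it follows from Lemma 2.1 that

$$S(\Phi\|\Psi) = S(\rho(\Phi)\|\rho(\Psi)) \geq S(\mathrm{Diag}[\rho(\Phi)]\|\mathrm{Diag}[\rho(\Psi)])
= \frac{1}{N} \sum_{j=1}^N H(B(\Phi)_j\|B(\Psi)_j) = H(B(\Phi)\|B(\Psi)),$$

where $B(\Phi)_j$ and $B(\Psi)_j$ denote the $j$th columns of $B(\Phi)$ and $B(\Psi)$.

□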

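Remark 4.5. Let $\mathcal{J}(\Phi) = H(B(\Phi)) - S^{\mathrm{map}}(\Phi)$ for a stochastic super-operator
$\Phi \in \mathrm{CP}(\mathcal{H})$. Then for a collection $\{\Phi_k\}$ of stochastic super-operators in $\mathrm{CP}(\mathcal{H})$ such
that $\Phi = \sum_k \lambda_k \Phi_k$, $\chi(\{\lambda_k, B(\Phi_k)\}) \leq \chi(\{\lambda_k, \Phi_k\})$ if and only if
$\mathcal{J}(\sum_k \lambda_k \Phi_k) \leq \sum_k \lambda_k \mathcal{J}(\Phi_k)$; i.e., $\mathcal{J}(\Phi)$ is a convex function in its
argument $\Phi$.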